Denoising genome-wide histone ChIP-seq with convolutional neural networks

Authors

  • Pang Wei Koh
  • Emma Pierson
  • Anshul Kundaje
Abstract

Motivation: Chromatin immunoprecipitation sequencing (ChIP-seq) experiments are commonly used to obtain genome-wide profiles of histone modifications associated with different types of functional genomic elements. However, the quality of histone ChIP-seq data is affected by many experimental parameters, such as the amount of input DNA, antibody specificity, ChIP enrichment and sequencing depth. Making accurate inferences from chromatin profiling experiments that involve diverse experimental parameters is challenging.

Results: We introduce a convolutional denoising algorithm, Coda, that uses convolutional neural networks to learn a mapping from suboptimal to high-quality histone ChIP-seq data. This overcomes various sources of noise and variability, substantially enhancing and recovering signal when applied to low-quality chromatin profiling datasets across individuals, cell types and species. Our method has the potential to improve data quality at reduced costs. More broadly, this approach, which uses a high-dimensional discriminative model to encode a generative noise process, is generally applicable to other biological domains where it is easy to generate noisy data but difficult to analytically characterize the noise or the underlying data distribution.

Availability and implementation: https://github.com/kundajelab/coda

Contact: [email protected]
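The core idea in the Results, learning a convolutional mapping from a noisy signal track to a high-quality one, can be illustrated with a deliberately tiny sketch. This is not the Coda implementation, and it does not reflect the authors' architecture or data. It assumes a single linear 1D convolutional filter trained by gradient descent on mean squared error, and it uses synthetic tracks (flat background plus broad peaks, with Gaussian noise standing in for a low-quality experiment). All function names and parameters here (`conv1d`, `make_tracks`, `train`, `width`, `lr`, `epochs`) are hypothetical.

```python
import random


def conv1d(signal, kernel, bias):
    """Zero-padded 1D convolution that returns a track of the same length."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        total = bias
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                total += w * signal[j]
        out.append(total)
    return out


def mse(pred, target):
    """Mean squared error between two equal-length tracks."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def make_tracks(rng, n=200):
    """Synthetic (noisy, clean) pair: broad peaks plus Gaussian noise.

    Assumption: this only stands in for paired low- and high-quality
    coverage tracks; it is not a model of real ChIP-seq noise.
    """
    clean = [0.0] * n
    for _ in range(3):
        centre = rng.randrange(20, n - 20)
        for j in range(centre - 10, centre + 11):
            clean[j] += 1.0
    noisy = [c + rng.gauss(0.0, 0.5) for c in clean]
    return noisy, clean


def train(pairs, width=9, lr=0.02, epochs=100):
    """Fit one convolutional filter by gradient descent on MSE."""
    half = width // 2
    kernel = [0.0] * width
    kernel[half] = 1.0  # start from the identity mapping
    bias = 0.0
    for _ in range(epochs):
        for noisy, clean in pairs:
            n = len(noisy)
            pred = conv1d(noisy, kernel, bias)
            grad_k = [0.0] * width
            grad_b = 0.0
            for i in range(n):
                err = 2.0 * (pred[i] - clean[i]) / n
                grad_b += err
                for k in range(width):
                    j = i + k - half
                    if 0 <= j < n:
                        grad_k[k] += err * noisy[j]
            kernel = [w - lr * g for w, g in zip(kernel, grad_k)]
            bias -= lr * grad_b
    return kernel, bias


rng = random.Random(0)
training_pairs = [make_tracks(rng) for _ in range(4)]
kernel, bias = train(training_pairs)

# Evaluate on a held-out synthetic track that was not used for training.
test_noisy, test_clean = make_tracks(rng)
noisy_mse = mse(test_noisy, test_clean)
denoised_mse = mse(conv1d(test_noisy, kernel, bias), test_clean)
print("MSE before denoising:", round(noisy_mse, 4))
print("MSE after denoising: ", round(denoised_mse, 4))
```

The filter is trained only on paired examples, with no explicit noise model, which is the sense in which a discriminative model can stand in for a noise process that is hard to characterize analytically. A real system would use a deeper network and experimentally paired data.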

Similar resources

Genome-wide analysis of histone acetylation dynamics during mouse embryonic stem cell neural differentiation

Epigenetic modification, as an intrinsic fine-tuning program, cooperates with key transcription factors to regulate cell fate determination. The participation of histone acetylation in the neural differentiation of pluripotent stem cells is expected but not well studied. Here, using acetylated histone H3 ChIP-sequencing (ChIP-seq), we demonstrate that the histone H3 acetylation level is gradually incr...

Genome-wide identification of DNA-protein interactions using chromatin immunoprecipitation coupled with flow cell sequencing (ChIP-Seq)

The transcriptional networks underlying mammalian cell development and function are largely unknown. The recently described use of flow cell sequencing devices in combination with chromatin immunoprecipitation (ChIP-seq) stands to revolutionize the identification of DNA-protein interactions. As such, ChIP-seq is rapidly becoming the method of choice for the genome wide localization of histone m...

Genome-wide identification of DNA-protein interactions using chromatin immunoprecipitation coupled with flow cell sequencing.

The transcriptional networks underlying mammalian cell development and function are largely unknown. The recently described use of flow cell sequencing devices in combination with chromatin immunoprecipitation (ChIP-seq) stands to revolutionize the identification of DNA-protein interactions. As such, ChIP-seq is rapidly becoming the method of choice for the genome-wide localization of histone m...

The histone demethylase KDM3A regulates the transcriptional program of the androgen receptor in prostate cancer cells

The lysine demethylase 3A (KDM3A, JMJD1A or JHDM2A) controls transcriptional networks in a variety of biological processes such as spermatogenesis, metabolism, stem cell activity, and tumor progression. We matched transcriptomic and ChIP-Seq profiles to decipher a genome-wide regulatory network of epigenetic control by KDM3A in prostate cancer cells. ChIP-Seq experiments monitoring histone 3 ly...

Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks.

The complex language of eukaryotic gene expression remains incompletely understood. Despite the importance suggested by many noncoding variants statistically associated with human disease, nearly all such variants have unknown mechanisms. Here, we address this challenge using an approach based on a recent machine learning advance: deep convolutional neural networks (CNNs). We introduce the open ...

Journal title:

Volume 33, Issue

Pages -

Publication date: 2017